Developing and Maintaining a WordNet: Procedures and Tools
Abstract
In this paper we present a set of tools that help wordnet developers not only to increase the number of synsets but also to ensure their quality, thus preventing a wordnet from becoming obsolete too soon. We discuss where the dangers lie in WordNet production and how they were addressed in the case of the Serbian WordNet. The tools developed fall into two categories: the first comprises tools for upgrading, cleaning and validation that produce a clean, up-to-date WordNet, while the second consists of tools gathered in a Web application that enables search, development and maintenance of a WordNet. The basic functions of this application are presented: XML support and import/export facilities, creation of new synsets, connection to the Princeton WordNet, sophisticated search and navigation, production of WordNet statistics, and safety procedures. Some of the presented tools were developed specifically for Serbian, while the majority are adaptable and can be used for wordnets of other languages.
Similar resources
Automatic Construction of Persian ICT WordNet using Princeton WordNet
WordNet is a large lexical database of the English language in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is mainly used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...
Using a Lemmatizer to Support the Development and Validation of the Greek WordNet
In this paper we aim to give a description of the computational tools that have been designed and implemented to support the development and validation process of the Greek WordNet, which is currently being developed in the framework of the BalkaNet project. In particular, we focus on the description of a lemmatizer for the Greek language, which has been used as the basis for a number of tools ...
HesNegar: Persian Sentiment WordNet
Awareness of others' opinions plays a crucial role in the decision making process performed by simple customers to top-level executives of manufacturing companies and various organizations. Today, with the advent of Web 2.0 and the expansion of social networks, a vast number of texts related to people's opinions have been created. However, exploring the enormous amount of documents, various opi...
PolNet - Polish WordNet: Data and Tools
This paper presents the PolNet-Polish WordNet project which aims at building a linguistically oriented ontology for Polish compatible with other WordNet projects such as Princeton WordNet, EuroWordNet and other similarly organized ontologies. The main idea behind this kind of ontologies is to use words related by synonymy to construct formal representations of concepts. In the paper we sketch t...
Introduction to Tools for IndoWordNet and Word Sense Disambiguation
Lexically rich resources form the foundation of all NLP tasks. Maintaining the high quality of resources is thus a high priority. In this paper we present the tools developed at IIT Bombay for the creation, enhancement and maintenance of WordNets, as well as those used for NLP tasks that use WordNets directly, such as Word Sense Disambiguation. The paper presents online an...